Abstract

Most current AI technology has been based on propositionally represented theoretical knowledge. We
argue that if AI is to accomplish its goals, especially in the tasks of sensory interpretation and sensorimotor
coordination, then it must solve the problem of representing embodied practical knowledge. Biological
evidence shows that animals use this knowledge in a way very different from digital computation. This
suggests that if these problems are to be solved, then we will need a new breed of computers, which we call
field computers. Examples of field computers are neurocomputers, optical computers, molecular computers,
and any kind of massively parallel analog computer. We claim that the principal characteristic of all
these computers is their massive parallelism, but we use this term in a special way. We argue that true
massive parallelism comes when the number of processors is so large that it can be considered a continuous
quantity. Designing and programming these computers requires a new theory of computation, one version
of which is presented in this paper. We describe a universal field computer, that is, a field computer that
can emulate any other field computer. It is based on a generalization of Taylor's theorem to continuous
dimensional vector spaces. A number of field computations are illustrated, including several transformations
useful in image understanding, and a continuous version of Kosko's bidirectional associative memory.
1.   AN EXTENDED THEORY OF KNOWLEDGE

"Though  it  be  allowed,  that  reason  may  form  very  plausible  conjectures  with  regard  to  consequences  of  such 
a  particular  conduct  in  such  particular  circumstances;  it  is  still  supposed  imperfect,  without  assistance  of 
experience,  which  is  alone  able  to  give  stability  and  certainty  to  the  maxims,  derived  from  study  and 
reflection."

—  David  Hume 

1.1    The  "New"  AI 

We argue that AI is moving into a new phase characterized by a broadened understanding of the nature of
knowledge,  and  by  the  use  of  new  computational  paradigms.  A  sign  of  this  transition  is  the  growing 
interest  in  neurocomputers,  optical  computers,  molecular  computers  and  a  new  generation  of  massively 
parallel  analog  computers.  In  this  section  we  outline  the  forces  driving  the  development  of  this  "new"  AI. 
In  the  remainder  of  the  paper  we  present  the  theory  of  field  computers,  which  is  intended  to  be  a 
comprehensive  framework  for  this  new  paradigm. 

The "old" AI has been quite successful in performing a number of difficult tasks, such as theorem proving,
chess playing, medical diagnosis and oil exploration. These are tasks that have traditionally required
human  intelligence  and  considerable  specialized  knowledge.  On  the  other  hand,  there  is  another  class  of 
tasks  in  which  the  old  AI  has  made  slower  progress,  such  as  speech  understanding,  image  understanding, 
and  sensorimotor  coordination.  It  is  interesting  that  these  tasks  apparently  require  less  intelligence  and 
knowledge  than  do  the  tasks  that  have  been  successfully  attacked.  Indeed,  most  of  these  recalcitrant  tasks 
are  performed  skillfully  by  animals  endowed  with  much  simpler  nervous  systems  than  our  own.  How  is 
this  possible? 

It is apparent that animals perform (at least some) cognitive tasks very differently from computers.
Neurons are slow devices. The well-known "Hundred Step Rule"* says that there cannot be more than
about a hundred sequential processing steps between sensory input and motor output. This suggests that
nervous systems perform sensorimotor tasks by relatively shallow, but very wide (i.e., massively parallel)
processing. Traditional AI technology depends on the digital computer's ability to do very deep (millions
of sequential operations), but narrow (1 to 100 processors) processing. Neurocomputing is an attempt to
obtain some of the advantages of the way animals do things by direct emulation of their nervous systems.

1.2    Theoretical and Practical Knowledge

One  visible  difference  between  the  old  and  new  AIs  is  in  their  computational  strategies:  the  former  stresses 
deep but narrow processing, the latter shallow but wide processing. Underlying this difference there is a
deeper  one:  a  difference  in  theories  of  knowledge.  The  old  AI  emphasizes  propositional  (verbalizable) 
knowledge.  That  is,  it  assumes  that  all  knowledge  can  be  represented  by  sentence-like  constructs  (i.e., 
finite  ensembles  of  discrete  symbols  arranged  in  accord  with  definite  syntactic  rules).  The  propositional 
view  is  not  new;  it  goes  very  far  back,  arguably  to  Pythagoras.  Yet  there  is  considerable  evidence  that 
nonpropositional knowledge is at least as important.

The problems of practical action, as opposed to theoretical contemplation, are too complicated for
propositional analysis. The real world is simply too messy for idealized theories to work. Representation in
terms of discrete categories, and cognition by manipulation of discrete structures referring to these
categories, may be appropriate to the idealized worlds of chess playing and theorem proving (although even
this is doubtful). However, in practical action the context looms large, as does the indefiniteness of
categories and the other second order effects that propositional representation routinely idealizes away.

Of  course,  the  approximations  of  propositional  representation  can  be  improved  by  a  deeper  theoretical 
analysis,  but  this  greatly  increases  the  computational  burden.  Traditional  AI  is  faced  with  a  dilemma: 
simple  theories  do  not  enable  skillful  behavior,  but  detailed  theories  are  computationally  infeasible.  There 
might  seem  to  be  no  way  to  avoid  this  tradeoff.  But,  recalling  the  Hundred  Step  Rule,  and  observing  that 
animals  behave  skillfully,  we  realize  that  there  must  be  a  third  alternative. 

The  limitations  of  traditional  AI  technology  show  us  the  limitations  of  theoretical  knowledge,  i.e., 
knowledge  that.  There  is,  however,  another  kind  of  knowledge,  which  we  can  call  practical  knowledge,  or 
knowledge  how.  For  example,  a  fish  knows  how  to  maintain  its  depth  in  the  water,  but  it  does  not  know 
that  neutral  buoyancy  is  achieved  by  adjusting  its  specific  gravity  to  that  of  water.  The  fish  does  not  have 
an  explicit  (propositional)  theory  of  how  temperature,  dissolved  substances,  etc.  affect  the  specific  gravity 
of  water,  nor  does  it  know  equations  describing  the  complex  manner  in  which  its  specific  gravity  depends 
on  the  state  of  its  body  (food  in  gullet,  air  in  air  bladder,  etc.  etc.).    Rather,  the  fish's  knowledge  how  is 
represented nonpropositionally in its nervous system and body.

1.3    The Acquisition of Knowledge

The  foregoing  suggests  that  by  discovering  how  to  represent  and  manipulate  practical  knowledge  the  new 
AI  may  accomplish  what  the  old  could  not.  There  are  difficulties,  however.  How  is  practical  knowledge 
acquired?  There  are  several  ways  theoretical  knowledge  is  acquired:  for  example,  it  may  be  taught.  Since 
propositions  can  be  encoded  in  verbal  structures,  language  can  be  used  to  transfer  theoretical  knowledge 
from one person to another. (Of course, more detailed theories require correspondingly larger verbal
structures for their encoding.) Thus, in principle, the representation of theoretical knowledge in a computer is
straightforward; we merely have to design an appropriate knowledge representation language. In effect
theoretical  knowledge  is  transferred  from  human  to  computer  in  the  same  way  it  is  transferred  from  human 
to  human. 

Before  theoretical  knowledge  can  be  transferred  it  must  be  acquired  in  the  first  place.  The  original 
discovery  of  theoretical  knowledge  is  beyond  the  scope  of  this  paper.  Here  we  restrict  ourselves  to  the 
transfer  of  theoretical  knowledge  from  one  person  to  another;  this  is  the  case  that  is  most  important  for 
expert  systems  and  other  applications  of  traditional  AI  technology. 

Since  practical  knowledge  is  nonpropositional,  it  cannot  be  encoded  verbally.  This  does  not  mean  it 
cannot be taught, however, since we can teach by showing as well as by saying. Therefore, although
theoretical knowledge is transferred by telling, practical knowledge is transferred by training. Indeed we
often  speak  of  "training"  a  neural  network  to  accomplish  some  task. 

We  have  seen  how  practical  knowledge  may  be  transferred.  How  is  it  acquired  in  the  first  place?  In  a 
word, by adaptation. In nature adaptation occurs predominantly at two levels: at the species level it leads
to innate practical knowledge; at the individual level it leads to learned practical knowledge. The foregoing
suggests that where the old AI depended on verbal encoding and transfer, the new AI will emphasize
training  and  adaptation  as  means  of  knowledge  acquisition. 
2.   FIELD TRANSFORMATION COMPUTERS

2.1   Massive  Parallelism 

The  preceding  section  suggests  that  the  new  AI  will  augment  the  traditional  deep,  narrow  computation 
with  shallow,  wide  computation.  That  is,  the  new  AI  will  exploit  massive  parallelism.  Now,  massive 
parallelism  means  different  things  to  different  people;  massive  parallelism  may  begin  with  a  hundred,  a 
thousand,  or  a  million  processors.  On  the  other  hand,  biological  evidence  suggests  that  skillful  behavior 
requires  a  very  large  number  of  processors,  so  many  in  fact  that  it  is  infeasible  to  treat  them  individually; 
they must be treated en masse. This has motivated us to propose the following definition of massive
parallelism: 

Definition (Massive Parallelism): A computational system is massively parallel if the number of
processing elements is so large that it may conveniently be considered a continuous quantity.

That is, a system is massively parallel if the processing elements can be considered a continuous mass
rather  than  a  discrete  ensemble. 

How large a number is large enough to be considered a continuous quantity? That depends on the
purpose at hand. A hundred is probably never large enough; a million is probably always large enough; a
thousand  or  ten  thousand  may  be  enough.  One  of  the  determining  factors  will  be  whether  the  number  is 
large  enough  to  permit  the  application  of  continuous  mathematics,  which  is  generally  more  tractable  than 
discrete  mathematics. 
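The point about permitting continuous mathematics can be illustrated with a small sketch. The setup is an assumption made only for illustration: n processing elements are spread evenly over the interval [0, 1], element i holds the value t*t at its midpoint, and the "continuous" answer is the integral of t*t, which is 1/3. The function name `discrete_total` is hypothetical.

```python
def discrete_total(n):
    """Total contribution of n elements spread evenly over [0, 1].

    Element i holds the sample t*t at its midpoint t = (i + 0.5) / n
    and is weighted by its share 1/n of the interval.
    """
    return sum(((i + 0.5) / n) ** 2 for i in range(n)) / n


CONTINUOUS_TOTAL = 1.0 / 3.0  # integral of t*t over [0, 1]

# Discrepancy between the discrete ensemble and its continuous idealization.
errors = {n: abs(discrete_total(n) - CONTINUOUS_TOTAL) for n in (100, 1000, 10000)}
```

Under these assumptions the discrepancy shrinks as the number of elements grows, which is the sense in which a large enough ensemble may be treated as a continuous quantity. Whether a given discrepancy is acceptable still depends on the purpose at hand.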

We propose this definition of massive parallelism for a number of reasons. First, as noted above,
skillful behavior seems to require significant neural mass. Second, we are interested in computers, such as
optical computers and molecular computers, for which the number of processing elements is effectively
continuous. Third, continuous mathematics is generally easier than discrete mathematics. And fourth, we want
to  encourage  a  new  style  of  thinking  about  parallelism.  Currently,  we  try  to  apply  to  parallel  machines  the 
thought  habits  we  have  acquired  from  thinking  about  sequential  machines.  This  strategy  works  fairly  well 
when  the  degree  of  parallelism  is  low,  but  it  will  not  scale  up.  One  cannot  think  individually  about  the 
10  processors  of  a  molecular  computer.  Rather  than  postpone  the  inevitable,  we  think  that  we  should 
begin now to develop a theoretical framework for understanding massively parallel computers. The
principal goal of this paper is to propose such a theory.

2.2  Field  Transformation 

Our aim then is to develop a way of looking at massive parallelism that encompasses a variety of
implementation technologies, including neural networks, optical computers, molecular computers and a new
generation of analog computers. What these all have in common is the ability to process in parallel amounts
of data so massive as to be considered a continuous quantity. This suggests that we structure our theory
around the idea of a field, i.e., a continuous (dense) ensemble of data. We have in mind both scalar fields
(such  as  potential  fields)  and  vector  fields  (such  as  gravitational  fields).  Any  operation  on  such  a  field, 
either  to  produce  another  field  or  to  produce  a  new  state  of  the  field,  can  be  considered  massively  parallel, 
since  it  operates  on  all  the  elements  of  the  field  in  parallel.  Indeed,  it  would  not  be  feasible  to  serialize  the 
processing  of  the  field;  modest  degrees  of  parallelism  cannot  cope  with  the  large  number  of  field  elements. 

In the remainder of this paper we explore field transformation computers, that is, computers
characterized by the ability to perform (in parallel) transformations on scalar and vector fields. We are not
suggesting that field computers are unable to perform scalar calculations; in fact we assume that field
transformation computers have the scalar capabilities of conventional digital and analog computers. Scalars have
many uses in field computation. For example, we may want to use a scalar parameter to control the rate
at which a field transformation takes place (e.g., a reaction rate in a molecular computer). Similarly, we
may use a scalar representing the average intensity of a field to control the contrast enhancement of that
field. A scalar threshold value may be used to suppress low level noise, and so forth.
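The scalar uses just mentioned can be sketched on a sampled field. This is only an illustration under stated assumptions: the field is approximated by a short list of samples, and the names `threshold`, `enhance_contrast`, `level` and `gain` are hypothetical, not operations defined in this paper.

```python
def mean(field):
    """Average intensity of a sampled field: a scalar derived from the field."""
    return sum(field) / len(field)


def threshold(field, level):
    """Suppress low-level noise: zero every sample whose magnitude is below `level`."""
    return [x if abs(x) >= level else 0.0 for x in field]


def enhance_contrast(field, gain):
    """Stretch the field about its average intensity by the scalar factor `gain`."""
    m = mean(field)
    return [m + gain * (x - m) for x in field]


field = [0.02, 0.5, 0.48, 0.01, 0.9]
denoised = threshold(field, 0.05)       # scalar threshold controls the transformation
stretched = enhance_contrast(field, 2.0)  # scalar mean and gain control the transformation
```

In both cases the transformation acts on every sample of the field at once, while a single scalar (the threshold, or the average intensity together with a gain) controls how it acts.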

An important reason for combining field computation with conventional digital computation is that it
permits knowing how to be combined with knowing that, leading to knowledgeable, skillful behavior. The
combined use of propositional and theoretical knowledge is unfortunately beyond the scope of this paper.

2.3  Classes  of  Field  Transformations 

Field transformations, like filters, can be divided into two classes: nonrecursive and recursive. A
nonrecursive transformation is simply a functional composition of more elementary transformations. The output
of a nonrecursive transformation depends only on its input. A recursive transformation involves some kind
of feedback. Hence, its output depends both on its input and on its prior state. Recursive transformations
are ideal for simulating the temporal behavior of a physical system, for example in simulated annealing*
and Boltzmann machines.
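The distinction between the two classes can be sketched on sampled fields. The elementary transformations `scale` and `clip`, the feedback rule `relax`, and its `rate` parameter are hypothetical examples chosen for illustration; they are not taken from this paper.

```python
def compose(*transforms):
    """Nonrecursive transformation: a functional composition of simpler ones.

    The output depends only on the input field.
    """
    def run(field):
        for transform in transforms:
            field = transform(field)
        return field
    return run


def scale(field):
    """Elementary transformation: double every sample."""
    return [2.0 * x for x in field]


def clip(field):
    """Elementary transformation: limit every sample to at most 1."""
    return [min(x, 1.0) for x in field]


def relax(state, field, rate):
    """One step of a recursive transformation (feedback).

    The new state depends on the input field and on the prior state.
    """
    return [s + rate * (x - s) for s, x in zip(state, field)]


pipeline = compose(scale, clip)
out = pipeline([0.2, 0.4, 0.8])

state = [0.0, 0.0, 0.0]
for _ in range(50):
    state = relax(state, [1.0, 2.0, 3.0], 0.5)
```

Here `pipeline` has no memory, so the same input always gives the same output, whereas the result of `relax` depends on how many steps have already been taken. In this sketch the state settles toward the input field, a simple stand-in for the temporal behavior of a physical system.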

2.4    General  Purpose  Field  Computers 

Many  field  computers  are  designed  for  special  purposes;  this  has  been  the  case  with  field  computers  to  date, 
and we expect it to be the case in the future. In these computers, devices implementing field
transformations (such as filters and convolutions) are assembled to solve a small class of problems (e.g., pattern
recognition). On the other hand, our experience with digital computation has shown us the value of general
purpose or programmable computers. This architectural feature permits one computer to perform a variety
of digital computations, which eliminates the need to construct special purpose devices, and speeds
implementation of digital algorithms.

The foregoing observations suggest that general purpose field computers will be similarly valuable. In
these the connections between field transformation units and field storage units are programmable, thus
facilitating their reconnection for a variety of purposes. In fact, we may want to make better use of our
resources by multiplexing the use of field transformation units under the control of a program. Thus, a
program for a general purpose field computer might look very much like a conventional program, except that
the  basic  operations  are  field  transformations  rather  than  scalar  arithmetic. 

We  cannot  build  into  a  general  purpose  field  computer  every  transformation  we  might  need.  Instead  we 
must  choose  a  set  of  primitive  operations  that  permit  the  programming  of  all  others.  How  can  such  a  set 
of  primitive  operations  be  chosen?  How  can  we  be  guaranteed  that  we  have  provided  all  the  necessary 
facilities?  For  digital  computers  this  question  is  answered  in  part  by  computability  theory.  For  example, 
this  theory  shows  us  how  to  construct  a  universal  Turing  machine,  which,  given  an  appropriate  program, 
can  emulate  any  Turing  machine.  Although  the  universal  Turing  machine  is  hardly  a  practical  general 
purpose computer, consideration of it and other universal machines shows us the kinds of facilities a
computer must have in order to be universal. There follows the hard engineering job of going from the
theoretically sufficient architecture to the practically necessary architecture.

Can  the  same  be  accomplished  for  field  computers?  Is  there  a  universal  field  computer  that  can  emulate 
any field computer? If there is such a thing, then we can expect that it may form a basis for practical
general purpose field computers in much the same way that Turing machines do for digital computers. In the
next  section  we  prove  that  general  purpose  field  computation  is  possible. 

3.   A UNIVERSAL FIELD COMPUTER

3.1  Introduction 

In this section we develop the general theory of field computation and prove the existence of a universal
field computer. In particular, we show that with a certain set of built-in field transformations we can
implement (to a desired degree of accuracy) any field transformation in a very wide class. This is
analogous to the result from Turing machine theory: The universal Turing machine allows us to implement (to
a desired degree of accuracy) any function in a wide class (now known as the computable functions).

The  phrase  'to  a  desired  degree  of  accuracy'  appears  in  both  of  the  preceding  statements.  What  does  it 
mean?  For  the  Turing  machine  it  means  that  a  given  accuracy  (e.g.,  precision  or  range  of  argument)  can 
be  achieved  by  providing  a  long  enough  tape.  For  the  digital  computer  it  means  that  computations  are 
normally  performed  to  a  given  precision  (e.g.,  the  word  length),  and  that  finite  increments  in  the  desired 
precision require finite increments in the resources required (e.g., additional registers and memory cells for
double  and  multiple  precision  results,  or  stack  space  for  recursion).  The  case  is  much  the  same  for  the 
universal  field  computer.  Finite  increments  in  the  desired  accuracy  of  a  field  transformation  will  require 
finite  increments  in  the  resources  used  (such  as  field  transformation  and  storage  units). 

There are a number of theoretical bases for a universal field computer. We have investigated designs
based on Fourier analysis, interpolation theory and Taylor's theorem, all generalized for field
transformations. In this paper we present the design based on Taylor's theorem. There are no doubt as many
principles upon which universal field computers can be based as there are bases for universal digital computers.

3.2  Taylor  Series  Approximation  of  Field  Transforms 

In this section we develop the basic theory of functions on scalar and vector fields and of their
approximation by Taylor series. Most of the definitions and theorems in sections 3.2 and 3.3 have been previously
published; they are reproduced here for completeness. Once it is understood that fields are treated as
continuous-dimensional vectors, it will be seen that the mathematics is essentially that of finite-dimensional
vectors. Note that the treatment here is heuristic rather than rigorous. First we consider scalar fields;
later we turn to vector fields.

As usual we take a scalar field to be a function $\phi$ from an underlying set $\Omega$ to an algebraic field $K$, thus
$\phi: \Omega \to K$. For our purposes $K$ will be the field of real numbers, $\mathbb{R}$. We use the notation $\Phi(\Omega)$ for the
set of all scalar fields over the underlying set $\Omega$ ($K = \mathbb{R}$ being understood). Thus, $\Phi(\Omega)$ is a function
space, and in fact a linear space under the following (pointwise) definitions of field sum and scalar product:

$$(\phi + \psi)_t = \phi_t + \psi_t, \qquad (a\phi)_t = a\,\phi_t \tag{1}$$

Note that we often write $\phi_t$ for $\phi(t)$, the value of the field at the point $t$. As a basis for this linear space
we take the unit functions $\omega_t$ for each $t \in \Omega$. They are defined

$$\omega_t(t) = 1$$

$$\omega_t(s) = 0, \quad \text{if } s \neq t \tag{2}$$

Note that

$$\phi = \int_\Omega \omega_t\,\phi_t\,dt \tag{3}$$

The preceding definitions show that we can think of scalar fields as vectors over the set $\Omega$. Since we want
to be quite general, we assume only that $\Omega$ is a measurable space. In practice, it will usually be a closed
and bounded subspace of $E^n$, $n$-dimensional Euclidean space. Thus we typically have one, two and three
dimensional closed and bounded scalar fields.

Since $\Omega$ is a measure space, we can define an inner product between scalar fields:

$$\phi \cdot \psi = \int_\Omega \phi_t\,\psi_t\,dt \tag{4}$$

We also define the norm:

$$\|\phi\| = \int_\Omega |\phi_t|\,dt \tag{5}$$
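The inner product (Eq. 4) and norm (Eq. 5) can be approximated numerically once a field is sampled. The following is a minimal sketch, assuming $\Omega = [0, 1]$ and a midpoint Riemann sum; the helper names `sample`, `inner` and `norm` are hypothetical and the two test fields are chosen only because their integrals are known in closed form.

```python
import math


def sample(fn, n):
    """Sample fn at the midpoints of n equal cells covering [0, 1]."""
    return [fn((i + 0.5) / n) for i in range(n)]


def inner(phi, psi):
    """Riemann-sum stand-in for Eq. 4: integral of phi_t * psi_t over [0, 1]."""
    dt = 1.0 / len(phi)
    return sum(p * q for p, q in zip(phi, psi)) * dt


def norm(phi):
    """Riemann-sum stand-in for Eq. 5: integral of |phi_t| over [0, 1]."""
    dt = 1.0 / len(phi)
    return sum(abs(p) for p in phi) * dt


n = 1000
phi = sample(lambda t: math.sin(2 * math.pi * t), n)
psi = sample(lambda t: t, n)

ip = inner(phi, psi)   # close to -1 / (2 * pi)
size = norm(phi)       # close to 2 / pi
```

With a thousand samples the sums agree with the closed-form integrals to several decimal places, which is the practical sense in which a sampled field behaves like a finite-dimensional vector.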

Thus $\Phi(\Omega)$ is the function space $L_1(\Omega)$. Note that the $\omega_t$ are not an orthogonal set under this norm, since
We first consider scalar valued functions of scalar fields, that is, functions $f: \Phi(\Omega) \to \mathbb{R}$. We prove
some basic properties of these functions, culminating in Taylor's theorem.

Definition (Differentiability): Suppose $f$ is a scalar valued function of scalar fields, $f: \Phi(\Omega) \to \mathbb{R}$, and
that $f$ is defined on a neighborhood of $\phi \in \Phi(\Omega)$. Then we say that $f$ is differentiable at $\phi$ if there is a
field $G \in \Phi(\Omega)$ such that for all $\alpha$ in this neighborhood

$$f(\phi + \alpha) - f(\phi) = \alpha \cdot G + \eta\,\|\alpha\| \tag{6}$$

where $\eta \to 0$ as $\|\alpha\| \to 0$. We will later call $G$ the gradient of $f$ at $\phi$.
Theorem: If $f$ is differentiable at $\phi$ then $f$ is continuous at $\phi$.

Proof: Since $f$ is differentiable at $\phi$ we know

$$f(\psi) - f(\phi) = (\psi - \phi) \cdot G + \eta\,\|\psi - \phi\|.$$

Therefore,

$$|f(\psi) - f(\phi)| = |(\psi - \phi) \cdot G + \eta\,\|\psi - \phi\||$$
$$\le |(\psi - \phi) \cdot G| + |\eta|\,\|\psi - \phi\|$$
$$\le \|G\|\,\|\psi - \phi\| + |\eta|\,\|\psi - \phi\|$$
$$= (\|G\| + |\eta|)\,\|\psi - \phi\|.$$

Thus $f$ is continuous at $\phi$. ■

The quantity $\alpha \cdot G$ whose existence is guaranteed by differentiability is called the directional
derivative of $f$ with respect to $\alpha$ at $\phi$. It is defined directly as follows.

Definition (Directional Derivative): The directional derivative in the "direction" $\alpha$ is given by the
following limit:

$$\nabla_\alpha f(\phi) = \frac{\partial f}{\partial \alpha}(\phi) = \lim_{h \to 0} \frac{f(\phi + h\alpha) - f(\phi)}{h} \tag{7}$$

We use $\nabla_\alpha f$ and $\partial f/\partial\alpha$ interchangeably for the directional derivative. Note that if $f$ is differentiable at $\phi$
then $\nabla_\alpha f(\phi) = \alpha \cdot G$.
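The limit in Eq. 7 can be checked numerically for a particular function. In the sketch below every specific is an assumption made for illustration: $\Omega = [0, 1]$ is sampled at midpoints, the function is $f(\phi) = \phi \cdot \phi$, for which the field $G$ in the definition of differentiability is $2\phi$, and the limit is replaced by a small finite step `h`. The helper names are hypothetical.

```python
def sample(fn, n):
    """Sample fn at the midpoints of n equal cells covering [0, 1]."""
    return [fn((i + 0.5) / n) for i in range(n)]


def inner(phi, psi):
    """Riemann-sum stand-in for the inner product of two sampled fields."""
    dt = 1.0 / len(phi)
    return sum(p * q for p, q in zip(phi, psi)) * dt


def f(phi):
    """Example scalar valued function of a field: f(phi) = phi . phi."""
    return inner(phi, phi)


def directional_derivative(func, phi, alpha, h=1e-6):
    """Finite-step stand-in for the limit in Eq. 7."""
    shifted = [p + h * a for p, a in zip(phi, alpha)]
    return (func(shifted) - func(phi)) / h


n = 1000
phi = sample(lambda t: t, n)
alpha = [1.0] * n                   # the "direction"
G = [2.0 * p for p in phi]          # G for this particular f only

numeric = directional_derivative(f, phi, alpha)
analytic = inner(alpha, G)          # alpha . G
```

For this choice the difference quotient and $\alpha \cdot G$ agree up to a term proportional to the step `h`, consistent with the note that the directional derivative of a differentiable function is $\alpha \cdot G$.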
Lemma: If $f$ is differentiable in a neighborhood of $\phi$, then

$$\frac{d}{dx} f(\phi + x\alpha) = \nabla_\alpha f(\phi + x\alpha) \tag{8}$$

Proof: By the definition of the derivative:

$$\frac{d}{dx} f(\phi + x\alpha) = \lim_{h \to 0} \frac{f[\phi + (x + h)\alpha] - f(\phi + x\alpha)}{h}$$
$$= \lim_{h \to 0} \frac{f(\phi + x\alpha + h\alpha) - f(\phi + x\alpha)}{h}$$
$$= \nabla_\alpha f(\phi + x\alpha)$$

The last step follows by the definition of the directional derivative. ■

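Theorem (Mean Value): Suppose $f: \Phi(\Omega) \to \mathbb{R}$ is continuous on a neighborhood containing $\phi$ and $\psi$.
Then, there is a $\theta$, $0 \le \theta \le 1$, such that

$$f(\psi) - f(\phi) = (\psi - \phi) \cdot \nabla f(\chi), \quad \text{where } \chi = \phi + \theta(\psi - \phi) \tag{9}$$

Proof: Let $\alpha = \psi - \phi$ and consider the function

$$F(x) = f(\phi + x\alpha) - f(\phi) - x[f(\psi) - f(\phi)].$$

Since $f$ is continuous, so is $F$. Now, since $F(0) = F(1) = 0$, we have by Rolle's Theorem that there is a
$\theta$, $0 \le \theta \le 1$, such that $F'(\theta) = 0$. Note that

$$F'(x) = \frac{d}{dx}\{f(\phi + x\alpha) - f(\phi) - x[f(\psi) - f(\phi)]\}$$
$$= \frac{d}{dx} f(\phi + x\alpha) - [f(\psi) - f(\phi)].$$

By the preceding lemma

$$F'(x) = \nabla_\alpha f(\phi + x\alpha) - [f(\psi) - f(\phi)].$$

Hence, substituting $\theta$ for $x$,

$$0 = F'(\theta) = \nabla_\alpha f(\phi + \theta\alpha) - [f(\psi) - f(\phi)].$$

Therefore, transposing we have

$$f(\psi) - f(\phi) = \nabla_\alpha f(\phi + \theta\alpha) = (\psi - \phi) \cdot \nabla f(\chi),$$

and the theorem is proved. ■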

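Theorem (Taylor): Suppose that $f$ and all its directional derivatives through order $n+1$ are continuous
in a neighborhood of $\phi$. Then for all $\alpha$ such that $\phi + \alpha$ is in that neighborhood there is a $\theta$, $0 \le \theta \le 1$,
such that

$$f(\phi + \alpha) = f(\phi) + \nabla_\alpha f(\phi) + \frac{1}{2}\nabla_\alpha^2 f(\phi) + \cdots + \frac{1}{n!}\nabla_\alpha^n f(\phi) + \frac{1}{(n+1)!}\nabla_\alpha^{n+1} f(\phi + \theta\alpha) \tag{10}$$

Proof: By the Taylor theorem on real variables,

$$f(\phi + t\alpha) = f(\phi) + \frac{d}{dt} f(\phi)\,t + \frac{1}{2}\frac{d^2}{dt^2} f(\phi)\,t^2 + \cdots + \frac{1}{n!}\frac{d^n}{dt^n} f(\phi)\,t^n + \frac{1}{(n+1)!}\frac{d^{n+1}}{dt^{n+1}} f(\phi + \theta t\alpha)\,t^{n+1},$$

where $\frac{d^k}{dt^k} f(\phi)$ abbreviates the $k$th derivative of $f(\phi + t\alpha)$ with respect to $t$ evaluated at $t = 0$, and the
last derivative is evaluated at the intermediate point $\theta t$.

Observe that by the preceding lemma, applied repeatedly,

$$\frac{d^k}{dt^k} f(\phi + t\alpha) = \nabla_\alpha^k f(\phi + t\alpha), \qquad k = 1, \ldots, n+1.$$

Therefore,

$$f(\phi + t\alpha) = f(\phi) + \nabla_\alpha f(\phi)\,t + \frac{1}{2}\nabla_\alpha^2 f(\phi)\,t^2 + \cdots + \frac{1}{n!}\nabla_\alpha^n f(\phi)\,t^n + \frac{1}{(n+1)!}\nabla_\alpha^{n+1} f(\phi + \theta t\alpha)\,t^{n+1}.$$

Setting $t = 1$ gives the desired result. ■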

The  extension  to  a  function  of  several  scalar  fields  is  routine. 

Since our "vectors" are continuous dimensional, partial derivatives are with respect to a "coordinate"
$t \in \Omega$ rather than with respect to a coordinate variable. To define partial derivatives it's convenient to
make use of the Dirac delta functions, $\delta_t$ for $t \in \Omega$:

$$\delta_t(t) = \infty$$

$$\delta_t(s) = 0, \quad \text{for } s \neq t \tag{11}$$

Of course, by the first equation above we mean $\delta_t(s) = \lim_{\epsilon \to 0} \epsilon^{-1}$ for $|s - t| < \epsilon/2$. Note the following
properties of the delta functions (fields):

$$\|\delta_t\| = 1 \tag{12}$$

Given the delta functions the partial derivative of $f$ at coordinate $t$ is simply expressed:

$$\frac{\partial}{\partial \delta_t} f = \nabla_{\delta_t} f \tag{13}$$


Theorem: If $f$ is differentiable at $\phi$ then the first order partial derivatives exist at $\phi$.

Proof: First observe that by differentiability

$$\frac{f(\phi + h\delta_t) - f(\phi)}{h} = \frac{h\,\delta_t \cdot G + \eta\,\|h\delta_t\|}{h}$$
$$= \delta_t \cdot G + \eta\,\|\delta_t\|\,|h|/h$$
$$= \delta_t \cdot G + \eta\,|h|/h$$
$$= G_t + \eta\,|h|/h$$

Recalling that $\eta \to 0$ as $h \to 0$, observe

$$\lim_{h \to 0}\left|\frac{f(\phi + h\delta_t) - f(\phi)}{h} - G_t\right| = \lim_{h \to 0}\left|G_t + \eta\,|h|/h - G_t\right| = \lim_{h \to 0}|\eta| = 0$$

Hence, $\frac{\partial}{\partial \delta_t} f(\phi) = G_t$, where $G$ is the field whose existence is guaranteed by differentiability. Thus the
partial derivative exists. ■

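What is the field $G$ whose points are the partial derivatives? It is just the gradient of the function.

Definition (Gradient): The gradient of $f: \Phi(\Omega) \to \mathbb{R}$ at $\phi$ is a field whose value at a point $t$ is the
partial derivative at that point, $\frac{\partial}{\partial \delta_t} f(\phi)$:

$$[\nabla f(\phi)]_t = \frac{\partial}{\partial \delta_t} f(\phi) \tag{14}$$

Since, by Eq. 3, $\nabla f(\phi) = \int_\Omega \omega_t\,[\nabla f(\phi)]_t\,dt$, the gradient can also be expressed in terms of the basis
functions and the partial derivatives:

$$\nabla f(\phi) = \int_\Omega \omega_t\,\frac{\partial}{\partial \delta_t} f(\phi)\,dt \tag{15}$$

When no confusion will result, we use the following operator notations:

$$\nabla f = \int_\Omega \omega_t\,\frac{\partial f}{\partial \delta_t}\,dt$$

$$\nabla = \int_\Omega \omega_t\,\frac{\partial}{\partial \delta_t}\,dt \tag{16}$$

Finally, since $\nabla f(\phi) = G$, the field guaranteed by differentiability, and $\nabla_\alpha f(\phi) = \alpha \cdot G$, we know

$$\nabla_\alpha f(\phi) = \alpha \cdot \nabla f(\phi) \tag{17}$$

or, in operator form, $\partial/\partial\alpha = \nabla_\alpha = \alpha \cdot \nabla$.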
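The relation between partial derivatives, the gradient and directional derivatives can be checked on a sampled field. The sketch below rests on several assumptions made only for illustration: $\Omega = [0, 1]$ is divided into $n$ equal cells, the delta function at cell $i$ is approximated by a field of height $n$ on that cell (so that its norm is 1), limits are replaced by small finite steps, and the example function is $f(\phi) = \phi \cdot \phi$, whose gradient is $2\phi$. All helper names are hypothetical.

```python
def inner(phi, psi):
    """Riemann-sum stand-in for the inner product of two sampled fields."""
    dt = 1.0 / len(phi)
    return sum(p * q for p, q in zip(phi, psi)) * dt


def f(phi):
    """Example function of a field: f(phi) = phi . phi."""
    return inner(phi, phi)


def delta(i, n):
    """Discrete stand-in for a delta function: height n on cell i, zero elsewhere."""
    return [float(n) if j == i else 0.0 for j in range(n)]


def directional(func, phi, alpha, h):
    """Finite-step stand-in for the directional derivative."""
    shifted = [p + h * a for p, a in zip(phi, alpha)]
    return (func(shifted) - func(phi)) / h


def gradient(func, phi, h=1e-8):
    """Field of partial derivatives: the value at cell i is the derivative along delta_i."""
    n = len(phi)
    return [directional(func, phi, delta(i, n), h) for i in range(n)]


n = 50
phi = [(i + 0.5) / n for i in range(n)]
alpha = [1.0 - p for p in phi]

grad = gradient(f, phi)                    # approximately 2 * phi
lhs = directional(f, phi, alpha, 1e-6)     # directional derivative along alpha
rhs = inner(alpha, grad)                   # alpha . gradient
```

In this example the field assembled from the partial derivatives matches $2\phi$, and the directional derivative along $\alpha$ matches the inner product of $\alpha$ with that field, up to errors that scale with the finite step sizes.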

3.3    A  Universal  Field  Computer  Based  on  Taylor  Series  Approximation 

We can use Taylor's Theorem to derive approximations of quite a general class of scalar valued functions
of scalar fields. Thus, if we equip our universal field computer with the hardware necessary to compute
Taylor series approximations, then we will be able to compute any of a wide class of functions (namely,
those functions whose first n partial derivatives exist and are continuous). Therefore, consider the general
form of an n-term Taylor series:
$$f(\phi) \approx \sum_{k=0}^{n} \frac{1}{k!}\,\nabla_\alpha^k f(\phi_0), \quad \text{where } \alpha = \phi - \phi_0 \tag{18}$$


What hardware is required? Clearly we will need a field subtracter for computing the difference field
$\alpha = \phi - \phi_0$. We will also need a scalar multiplier for scaling each term by $1/k!$; we will also need a scalar
adder for adding the terms together. The harder problem is to find a way to compute $\nabla_\alpha^k f(\phi_0)$ for a vector
$\alpha$ that depends on the (unknown) input $\phi$. The trouble is that the $\alpha$s and the $\nabla$s are interleaved, as can
be seen here:

$$\nabla_\alpha^k f(\phi_0) = (\alpha \cdot \nabla)(\alpha \cdot \nabla) \cdots (\alpha \cdot \nabla)\, f(\phi_0)$$

We want to separate everything that depends on $\alpha$, and is thus variable, from everything that depends on
$f(\phi_0)$, and is thus fixed. This can be accomplished (albeit with extravagant use of our dimensional
resources) by means of an outer product operation. Therefore we define the outer product of two scalar
fields:

$$(\phi \wedge \psi)_{s,t} = \phi_s\,\psi_t \tag{19}$$

Note that if $\phi, \psi \in \Phi(\Omega)$ then $\phi \wedge \psi \in \Phi(\Omega^2)$.

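To see how the outer product allows the variable and fixed parts to be separated, consider first the case
of the second derivative:

$$\nabla_\alpha^2 f(\phi_0) = (\alpha \cdot \nabla)(\alpha \cdot \nabla)\, f(\phi_0)$$
$$= \int_\Omega \int_\Omega (\alpha \wedge \alpha)_{s,t}\,(\nabla)_s (\nabla)_t\, f(\phi_0)\,dt\,ds$$
$$= \int_\Omega \int_\Omega (\alpha \wedge \alpha)_{s,t}\,(\nabla \wedge \nabla)_{s,t}\, f(\phi_0)\,dt\,ds$$
$$= \int_{\Omega^2} (\alpha \wedge \alpha)_x\,(\nabla \wedge \nabla)_x\,dx\; f(\phi_0)$$
$$= (\alpha \wedge \alpha) \cdot (\nabla \wedge \nabla)\, f(\phi_0)$$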
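Now we can see how the general case goes. First we define the k-fold outer product:

$$\phi^{[k+1]} = \phi \wedge \phi^{[k]} \qquad (20)$$

Then,

$$\nabla_\alpha^k f(\phi_0) = \alpha^{[k]} \cdot \nabla^{[k]} f(\phi_0)$$

The n-term Taylor series then becomes

$$f(\phi) \approx \sum_{k=0}^{n} \frac{1}{k!}\,(\phi - \phi_0)^{[k]} \cdot \nabla^{[k]} f(\phi_0) \qquad (22)$$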


Since $\phi_0$ is fixed, we can compute each $\nabla^{[k]} f(\phi_0)$ once, when the field computer is programmed. Then, for
any given input $\phi$ we can compute $(\phi - \phi_0)^{[k]}$ and take the inner product of this with $\nabla^{[k]} f(\phi_0)$. Thus, in
addition to the components mentioned above, computing the Taylor series approximation also requires
outer and inner product units that will accommodate spaces up to those in $\Phi(\Omega^n)$.
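The separation of fixed and variable parts can be sketched on a sampled field. The following is a minimal illustration, not the paper's hardware design: it assumes a field is a short list of samples with unit measure per sample (so integrals become plain sums), and it uses a hypothetical quadratic functional `f` with symmetric kernel `K`, for which the two-term series is exact. The names `outer`, `inner1`, `inner2`, `K`, and `taylor` are illustrative.

```python
def outer(a, b):
    """Outer product of two sampled fields: (a ^ b)[s][t] = a[s] * b[t]."""
    return [[x * y for y in b] for x in a]

def inner1(a, b):
    """Inner product of two fields over Omega (unit measure per sample)."""
    return sum(x * y for x, y in zip(a, b))

def inner2(A, B):
    """Inner product of two fields over Omega^2."""
    return sum(x * y for ra, rb in zip(A, B) for x, y in zip(ra, rb))

# Hypothetical functional f(phi) = sum over s,t of K[s][t] * phi[s] * phi[t].
K = [[2.0, 0.5, 0.0],
     [0.5, 1.0, 0.25],
     [0.0, 0.25, 3.0]]

def f(phi):
    return inner2(K, outer(phi, phi))

phi0 = [0.5, -1.0, 0.25]

# "Programming" step: everything here depends only on f and phi0.
f0 = f(phi0)
grad1 = [2.0 * inner1(row, phi0) for row in K]   # gradient of f at phi0 (K symmetric)
grad2 = [[2.0 * k for k in row] for row in K]    # second-order gradient field on Omega^2

def taylor(phi):
    """Run-time step: only alpha and its outer product depend on the input."""
    alpha = [p - q for p, q in zip(phi, phi0)]
    return f0 + inner1(alpha, grad1) + 0.5 * inner2(outer(alpha, alpha), grad2)
```

Calling `taylor(phi)` and `f(phi)` on the same input should agree up to rounding, because a quadratic functional has no terms beyond second order. The cost of the separation is visible in `grad2`, which lives on the squared domain.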

We consider a very simple example of Taylor series approximation. Suppose we want to approximate
defint $\phi$, which computes the definite integral of $\phi$, defint $\phi = \int_\Omega \phi_s\,ds$. First we determine its partial
derivative at t by observing:

$$\lim_{h\to 0} \frac{\mathrm{defint}(\phi + h\delta_t) - \mathrm{defint}\,\phi}{h} = \lim_{h\to 0} \frac{\int_\Omega \left[\phi_s + h\delta_t(s)\right] ds - \int_\Omega \phi_s\,ds}{h} = \lim_{h\to 0} \frac{h\,\|\delta_t\|}{h} = 1 \qquad (23)$$

Thus, $\frac{\partial}{\partial\delta_t}\,\mathrm{defint}\,\phi = 1$, and we can see that

$$\nabla\,\mathrm{defint}\,\phi = \mathbf{1},$$

where $\mathbf{1}$ is the constant 1 function, $\mathbf{1}_t = 1$. This leads to a one term Taylor series, which is exact:

$$\mathrm{defint}\,\phi = \phi \cdot \mathbf{1} \qquad (24)$$

Note that $\mathbf{1}$ is a fixed field that must be loaded into the computer.
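As a rough numerical check of Eq. 24, the inner product with the constant field can be approximated by a Riemann sum. This sketch assumes the domain is the interval [0, 1] sampled at midpoints; the sample count `n`, the helper `inner`, and the test field sin(πs) are illustrative choices, not part of the original.

```python
import math

def inner(a, b, ds):
    """Discretized inner product: approximates the integral of a_s * b_s ds."""
    return sum(x * y for x, y in zip(a, b)) * ds

n = 1000
ds = 1.0 / n
samples = [(i + 0.5) * ds for i in range(n)]   # midpoints of [0, 1]

one = [1.0] * n                                # the fixed field loaded into the computer
phi = [math.sin(math.pi * s) for s in samples] # an arbitrary input field

defint_phi = inner(phi, one, ds)               # defint phi = phi . 1
```

The exact integral of sin(πs) over [0, 1] is 2/π, so `defint_phi` should match that value to within the midpoint-rule error.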

3.4    Transformations  on  Scalar  and  Vector  Fields 

The  previous  results  apply  to  scalar  valued  functions  of  scalar  fields.  These  kinds  of  functions  are  useful 
(e.g.,  to  compute  the  average  value  of  a  scalar  field),  but  they  do  not  exploit  the  full  parallelism  of  a  field 
computer.  Achieving  this  requires  the  use  of  functions  that  accept  a  (scalar  or  vector)  field  as  input,  and 
return  a  field  as  output.  We  briefly  sketch  the  theory  for  scalar  field  valued  functions  of  scalar  fields; 
transformations  on  vector  fields  are  an  easy  extension  of  this. 

By a scalar field valued transformation of scalar fields we mean a function $F: \Phi(\Omega_1) \to \Phi(\Omega_2)$. Such a
transformation is considered a family of scalar valued functions $f_t: \Phi(\Omega_1) \to \mathbb{R}$ for each $t \in \Omega_2$; these are
the component functions of F. Note that F can be expressed in terms of its components:

$$F(\phi) = \int_{\Omega_2} f_t(\phi)\,\omega_t\,dt \qquad (25)$$

More briefly, $F = \int_{\Omega_2} f_t\,\omega_t\,dt$. F is decomposed into its components by $\delta_t \cdot F(\phi) = f_t(\phi)$.

Next we turn to the differentiability of field transformations. To define this it is necessary to first
define an analog to the inner product for fields of different dimension. Thus, we define the
continuous-dimensional analogue of a vector-matrix product:

$$(\phi\Psi)_t = \int_{\Omega_1} \phi_s\,\Psi_{s,t}\,ds, \qquad (\Psi\phi)_s = \int_{\Omega_2} \Psi_{s,t}\,\phi_t\,dt$$

where the transpose of a field is defined $\Psi^T_{s,t} = \Psi_{t,s}$. With this notation differentiability can be defined.

Definition (Differentiability of Field Transformations): Suppose $F: \Phi(\Omega_1) \to \Phi(\Omega_2)$ is a field
valued function of scalar fields defined on a neighborhood of $\phi \in \Phi(\Omega_1)$. We say that F is differentiable at
$\phi$ provided there is a field $\Gamma \in \Phi(\Omega_1 \times \Omega_2)$ such that for all $\alpha$ in the neighborhood

$$F(\phi + \alpha) - F(\phi) = \alpha\Gamma + H\,\|\alpha\| \qquad (27)$$

where $H \in \Phi(\Omega_2)$ and $\|H\| \to 0$ as $\|\alpha\| \to 0$. We will show that $\Gamma$ is the gradient of F at $\phi$.

Next we consider the directional derivative of a field transformation. For a scalar function f, $\nabla_\alpha f(\phi)$
is a scalar that describes how much $f(\phi)$ changes when its argument is perturbed by a small amount in the
"direction" $\alpha$. For a field transformation F, $\nabla_\alpha F(\phi)$ should be a field, each component of which reflects
how much the corresponding component of $F(\phi)$ changes when $\phi$ moves in the "direction" $\alpha$. That is,
$[\nabla_\alpha F(\phi)]_t = \nabla_\alpha\,\delta_t \cdot F(\phi)$. Hence,

$$\nabla_\alpha F(\phi) = \int_{\Omega_2} \omega_t\,\nabla_\alpha\,\delta_t \cdot F(\phi)\,dt \qquad (28)$$

or, more briefly, $\nabla_\alpha F = \int_{\Omega_2} \omega_t\,\nabla_\alpha\,\delta_t \cdot F\,dt$. The directional derivative is defined directly by a limit.

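Definition (Directional Derivative of Field Transformations): If $F: \Phi(\Omega_1) \to \Phi(\Omega_2)$ is
differentiable at $\phi$ then

$$\nabla_\alpha F(\phi) = \frac{\partial F}{\partial\alpha}(\phi) = \lim_{h\to 0} \frac{F(\phi + h\alpha) - F(\phi)}{h} \qquad (29)$$

It is obvious that $\nabla_\alpha F(\phi) = \alpha\Gamma$, where $\Gamma$ is the field whose existence is guaranteed by differentiability.
This suggests the definition: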

Definition (Gradient of Field Transformation): The gradient of F at $\phi$ is the field whose value at a
point t is the partial derivative of F at that point:

$$[\nabla F(\phi)]_t = \frac{\partial F}{\partial\delta_t}(\phi) \qquad (30)$$

Since $\nabla_{\delta_t} F(\phi) = \delta_t\Gamma = \Gamma_t$, it follows that $\nabla F(\phi) = \Gamma$, as expected.

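Notice that the gradient is of higher dimension than the field transformation. That is, if
$F: \Phi(\Omega_1) \to \Phi(\Omega_2)$, then $\nabla F: \Phi(\Omega_1) \to \Phi(\Omega_1 \times \Omega_2)$. Higher order gradients will have similarly higher
order:

$$\nabla^k F: \Phi(\Omega_1) \to \Phi(\Omega_1^k \times \Omega_2), \quad \text{for } F: \Phi(\Omega_1) \to \Phi(\Omega_2) \qquad (31)$$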

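A derivation similar to that for scalar field functions yields the Taylor series for field transformations:

$$F(\phi) \approx \sum_{k=0}^{n} \frac{1}{k!}\,\nabla_\alpha^k F(\phi_0), \quad \text{where } \alpha = \phi - \phi_0$$

As before, outer products can be used to separate the variable and fixed components.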

We illustrate the Taylor series computation for a simple but important class of field transformations,
the integral operators. For example, the derivative and difference transformations, which are very useful in
image processing, are integral operators, as will be shown below.

Definition (Integral Operator): A field transformation $F: \Phi(\Omega_1) \to \Phi(\Omega_2)$ is an integral operator if
there is a field $\Psi \in \Phi(\Omega_2 \times \Omega_1)$, called its kernel, such that

$$F(\phi) = \Psi\phi$$

Recall $(\Psi\phi)_t = \int_{\Omega_1} \Psi_{t,s}\,\phi_s\,ds$.


Since $F(\phi + \psi) = \Psi(\phi + \psi) = \Psi\phi + \Psi\psi = F(\phi) + F(\psi)$, it's clear that integral operators are
linear. Thus, their gradients are especially easy to compute:

Theorem (Gradient of Integral Operator): The gradient of an integral operator is the transpose of
its kernel.

Proof: Suppose $F(\phi) = \Psi\phi$ is an integral operator. Observe $F(\phi + \alpha) - F(\phi) = \Psi\alpha = \alpha\Psi^T$, where
$\Psi^T_{s,t} = \Psi_{t,s}$. Hence $\nabla F(\phi) = \Psi^T$. ∎

Since the gradient of a constant function is zero, all higher order gradients vanish, so the Taylor series for
an integral operator is exact: $F(\phi) = \phi\,\nabla F = \phi\Psi^T$.

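Next we show that the derivative and difference operations on scalar fields are integral operators and we
compute their Taylor series. First consider the finite difference transformation $\Delta$ defined so that

$$(\Delta\phi)_t = \phi_{t+h} - \phi_t \qquad (34)$$

for a given h > 0. To see that this is an integral operator observe:

$$(\Delta\phi)_t = \delta_{t+h} \cdot \phi - \delta_t \cdot \phi = (\delta_{t+h} - \delta_t) \cdot \phi \qquad (35)$$

Define the field $\Psi$ by $\Psi_{tu} = \delta_{t+h}(u) - \delta_t(u)$ and it follows that $\Delta\phi = \Psi\phi$, since

$$(\Psi\phi)_t = \int_\Omega \Psi_{tu}\,\phi_u\,du = \int_\Omega \left[\delta_{t+h}(u) - \delta_t(u)\right]\phi_u\,du = (\delta_{t+h} - \delta_t) \cdot \phi$$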

The only trouble with this formula for the finite difference is that the field $\Psi$ is not physically realizable,
since it makes use of the Dirac functions. In practice we have to replace $\delta_{t+h}$ and $\delta_t$ by finite approximations,
but the resulting approximate difference transformation is still an integral operator. The same
applies to the derivative transformation, which can be approximated by the approximations to the first
and higher order derivatives.
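One way to picture the finite approximation is to replace each Dirac function with a narrow box of unit area on a sampled domain. The sketch below assumes a domain of `n` samples with spacing `ds` and a shift of `m` samples (so h = m * ds); `delta_approx` and `apply_kernel` are illustrative names, and the box is only one of many possible stand-ins for the delta.

```python
def delta_approx(center, n, ds):
    """Finite stand-in for a Dirac delta: a box of width ds and unit area."""
    return [1.0 / ds if u == center else 0.0 for u in range(n)]

def apply_kernel(kernel, phi, ds):
    """Integral operator (Psi phi)_t = integral of Psi[t][u] * phi[u] du, as a Riemann sum."""
    return [sum(k * p for k, p in zip(row, phi)) * ds for row in kernel]

n, m = 8, 2          # number of samples and shift in samples (assumed values)
ds = 0.125           # sample spacing, so h = m * ds = 0.25

# Kernel rows Psi[t] = delta_{t+h} - delta_t, for every t whose shifted point is in range.
kernel = []
for t in range(n - m):
    shifted = delta_approx(t + m, n, ds)
    here = delta_approx(t, n, ds)
    kernel.append([a - b for a, b in zip(shifted, here)])

phi = [(u * ds) ** 2 for u in range(n)]                 # sample input field
diff_by_kernel = apply_kernel(kernel, phi, ds)          # Psi phi
diff_direct = [phi[t + m] - phi[t] for t in range(n - m)]
```

Because the kernel is fixed, it can be loaded once and reused for every input, and `diff_by_kernel` should reproduce `diff_direct` on the samples where the shifted point exists.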

To further illustrate the Taylor series for scalar field valued transformations, we consider pointwise
transformations. A pointwise transformation applies a scalar function (but not necessarily the same
function) to every element of a field. That is, $F: \Phi(\Omega) \to \Phi(\Omega)$ is a pointwise transformation if for some
$f_t: \mathbb{R} \to \mathbb{R}$,

$$[F(\phi)]_t = f_t(\phi_t) \qquad (37)$$

Note that

$$\delta_t \cdot F(\phi) = f_t(\phi_t) \qquad (38)$$


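Lemma: If $F: \Phi(\Omega) \to \Phi(\Omega)$ is a pointwise transformation, then

$$\nabla_\alpha\,\delta_t \cdot F(\phi) = \delta_t \cdot \alpha F'(\phi)$$

where $[F'(\phi)]_t = f_t'(\phi_t)$, and $f_t'$ is the derivative of $f_t$.

Proof: By the definition of the derivative:

$$\nabla_\alpha\,\delta_t \cdot F(\phi) = \lim_{h\to 0} \frac{\delta_t \cdot F(\phi + h\alpha) - \delta_t \cdot F(\phi)}{h}$$
$$= \lim_{h\to 0} \frac{f_t(\phi_t + h\alpha_t) - f_t(\phi_t)}{h}$$
$$= \lim_{h\to 0} \left[\alpha_t f_t'(\phi_t) + \eta\,|h\alpha_t| / h\right], \quad \text{where } \eta \to 0 \text{ as } h \to 0$$
$$= \alpha_t f_t'(\phi_t)$$
$$= \delta_t \cdot \alpha F'(\phi) \qquad (39)$$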


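Theorem (Directional Derivative of Pointwise Transformation): If $F: \Phi(\Omega) \to \Phi(\Omega)$ is a pointwise
transformation, then

$$\nabla_\alpha F(\phi) = \alpha F'(\phi) \qquad (40)$$

Proof: Applying the preceding lemma:

$$\nabla_\alpha F(\phi) = \int_\Omega \omega_t\,\nabla_\alpha\,\delta_t \cdot F(\phi)\,dt$$
$$= \int_\Omega \omega_t\,\delta_t \cdot \alpha F'(\phi)\,dt$$
$$= \alpha \int_\Omega \omega_t\,\delta_t \cdot F'(\phi)\,dt$$
$$= \alpha F'(\phi)$$

The last step follows by Eq. 12. ∎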

Corollary: The directional derivative of a pointwise transformation is a pointwise transformation.

Proof:  This  follows  immediately  from  the  preceding  theorem.     ■ 

The  theorem  and  its  corollary  lead  to  the  Taylor  series  expansion  of  a  pointwise  transformation: 

Theorem (Taylor Series for Pointwise Transformation): If $F: \Phi(\Omega) \to \Phi(\Omega)$ is a pointwise
transformation, and its component functions and all their derivatives through n + 1 are continuous in a
neighborhood of $\phi$, then for all $\alpha$ such that $\phi + \alpha$ is in that neighborhood there is a field $\theta$, $0 \le \theta_t \le 1$,
such that

$$F(\phi + \alpha) = \sum_{k=0}^{n} \frac{1}{k!}\,\alpha^k F^{(k)}(\phi) + \frac{1}{(n+1)!}\,\alpha^{n+1} F^{(n+1)}(\phi + \theta\alpha) \qquad (41)$$

Here $F^{(k)}$ is the kth derivative of F:

$$[F^{(k)}(\phi)]_t = f_t^{(k)}(\phi_t) = \left.\frac{d^k f_t(x)}{dx^k}\right|_{x=\phi_t} \qquad (42)$$

where $f_t = \delta_t \cdot F$.

Proof: By the Taylor theorem for functions of reals:

$$[F(\phi + \alpha)]_t = f_t(\phi_t + \alpha_t)$$
$$= \sum_{k=0}^{n} \frac{1}{k!}\,\alpha_t^k f_t^{(k)}(\phi_t) + \frac{1}{(n+1)!}\,\alpha_t^{n+1} f_t^{(n+1)}(\phi_t + \theta_t\alpha_t)$$
$$= \sum_{k=0}^{n} \frac{1}{k!}\left[\alpha^k F^{(k)}(\phi)\right]_t + \frac{1}{(n+1)!}\left[\alpha^{n+1} F^{(n+1)}(\phi + \theta\alpha)\right]_t \qquad ∎$$

This result permits converting Taylor series for scalar functions to the Taylor series for the corresponding
pointwise field transformations. Notice also that the resulting series do not require outer products,
provided we have a field multiplier, $(\phi\psi)_t = \phi_t\,\psi_t$.

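We illustrate this theorem by determining a Taylor series approximation for ln, the function that
computes the natural logarithm of each field element.

Theorem (Taylor Series for Pointwise Logarithm): Suppose $(\ln\phi)_t = \ln\phi_t$. Then

$$\ln\phi = (\phi - 1) - \frac{1}{2}(\phi - 1)^2 + \frac{1}{3}(\phi - 1)^3 - \cdots + \frac{(-1)^{n-1}}{n}(\phi - 1)^n + \frac{1}{(n+1)!}(\phi - 1)^{n+1}\,\ln^{(n+1)}\!\left[1 + \theta(\phi - 1)\right]$$

provided $|\phi_t - 1| \le 1$ and $\phi_t \ne 0$, for all $t \in \Omega$.

Proof: Note that for k > 0, $\ln^{(k)} x = (-1)^{k-1}(k-1)!\,/\,x^k$. Therefore, for $k \ge 1$,
$\ln^{(k)} 1 = (-1)^{k-1}(k-1)!$. By the Taylor theorem,

$$\ln(1 + \alpha) = \ln 1 + \sum_{k=1}^{n} \frac{1}{k!}\,\alpha^k \ln^{(k)} 1 + \frac{1}{(n+1)!}\,\alpha^{n+1} \ln^{(n+1)}(1 + \theta\alpha)$$
$$= \sum_{k=1}^{n} \frac{1}{k!}\,\alpha^k (-1)^{k-1}(k-1)! + \frac{1}{(n+1)!}\,\alpha^{n+1} \ln^{(n+1)}(1 + \theta\alpha)$$
$$= \sum_{k=1}^{n} \frac{(-1)^{k-1}}{k}\,\alpha^k + \frac{1}{(n+1)!}\,\alpha^{n+1} \ln^{(n+1)}(1 + \theta\alpha)$$

To prove the theorem let $\alpha = \phi - 1$. ∎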
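The pointwise logarithm series can be tried on a sampled field using only elementwise multiplication and addition, which is the point of the remark that no outer products are needed. This is a minimal sketch: the function name `pointwise_ln_taylor`, the sample values, and the choice of 30 terms are assumptions made for illustration, and the remainder term is simply dropped.

```python
import math

def pointwise_ln_taylor(phi, n):
    """n-term Taylor approximation of ln applied to each field element,
    expanded about the constant field 1 (remainder term omitted)."""
    result = []
    for value in phi:
        alpha = value - 1.0          # alpha = phi - 1, elementwise
        power = 1.0
        total = 0.0
        for k in range(1, n + 1):
            power *= alpha           # alpha^k via repeated field multiplication
            total += (-1) ** (k - 1) * power / k
        result.append(total)
    return result

phi = [0.6, 0.8, 1.0, 1.2, 1.4]      # sample field, all elements within 0.4 of 1
approx = pointwise_ln_taylor(phi, 30)
exact = [math.log(v) for v in phi]
```

For elements this close to 1 the truncated series should agree with `math.log` to many digits; convergence slows as an element approaches 0 or 2, the edges of the region allowed by the theorem.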


We consider vector fields briefly. Recall that any three-dimensional vector field $\boldsymbol{\phi}$ can be considered
three scalar fields $\phi$, $\psi$, $\chi$ where

$$\boldsymbol{\phi}_t = \phi_t\,\mathbf{i} + \psi_t\,\mathbf{j} + \chi_t\,\mathbf{k} \qquad (44)$$

Similarly, a function that returns a three-dimensional vector field can be broken down into three functions
that return scalar fields. Thus, we see that a transformation on finite dimensional vector fields can be
implemented by a finite number of transformations on scalar fields.

To ensure the continuity of field valued functions, certain restrictions must be placed on the fields
permitted as arguments. Although these restrictions are still under investigation, we believe that it is
sufficient that the input field's gradient be bounded at each stage. This will be the case for all physically
realizable fields. This restriction on allowable inputs finds its analogy in digital computers: legal input
numbers are restricted to some range; numbers outside that range may cause underflow or overflow in the
subsequent computation. In the same way here, fields whose gradients are too large may lead to incorrect
results.

4.   EXAMPLE  APPLICATION:   BIDIRECTIONAL  ASSOCIATIVE  FIELD  MEMORY

In this section we illustrate the theory of field computation by analyzing a continuous version of Kosko's
bidirectional associative memory. The system operates as follows (see Figure 1).


[Figure 1. Bidirectional Associative Field Memory, with ports in_1, in_2, out_1, and out_2.]

The goal is to store a number of associative pairs $(\psi^{(k)}, \phi^{(k)})$, where $\phi^{(k)} \in \Phi(\Omega_1)$ and $\psi^{(k)} \in \Phi(\Omega_2)$.
(We assume these fields are bipolar, that is, take values in {−1, +1}, although physical realizability
actually requires continuous variation between −1 and +1.) Presentation of a field $\phi \in \Phi(\Omega_1)$ at in_1
eventually yields at out_1 and out_2 the pair j for which $\phi^{(j)}$ is the closest match for $\phi$. Similarly, presentation of
$\psi$ at in_2 will yield the pair for which $\psi^{(j)}$ is the closest match. The pairs are stored in a distributed
fashion in the field $\Psi \in \Phi(\Omega_2 \times \Omega_1)$ computed as follows:

$$\Psi = \psi^{(1)} \wedge \phi^{(1)} + \cdots + \psi^{(n)} \wedge \phi^{(n)} \qquad (45)$$

Note that $\psi^{(k)} \wedge \phi^{(k)}$ reflects the cross correlations between the values of $\psi^{(k)}$ and the values of $\phi^{(k)}$.

Two of the boxes in Fig. 1 perform "matrix-vector" multiplications, $\Psi\phi$ or $\psi\Psi$. Thus presentation of a
field $\phi$ at in_1 yields

$$\psi' = S(\psi, \Psi\phi) \qquad (46)$$

at out_1. Here S is a nonlinear function that helps to suppress crosstalk between stored pairs by forcing
field values back to −1 or +1 (it is described below). Computation of $\psi'$ in turn yields

$$\phi' = S(\phi, \psi'\Psi) \qquad (47)$$

at out_2. On the next iteration we get $\psi'' = S(\psi', \Psi\phi')$ and $\phi'' = S(\phi', \psi''\Psi)$, and so forth. Each
iteration will yield a closer approximation of the desired $(\psi^{(j)}, \phi^{(j)})$. We cannot in general guarantee that
the system will stabilize (i.e., that the $\psi$ and $\phi$ will become constant), but we will show that the changes
will become as small as we like. Kosko can show stability, since the discreteness of his fields places a lower
bound on nonzero change.

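The nonlinear function S updates the output field in the following way [illustrated for the computation
$\psi' = S(\psi, \Psi\phi)$]:

$$\psi'_t = \begin{cases} +1 & \text{if } \Psi_t \cdot \phi > \theta_1 \\ \psi_t & \text{if } -\theta_2 \le \Psi_t \cdot \phi \le \theta_1 \\ -1 & \text{if } \Psi_t \cdot \phi < -\theta_2 \end{cases} \qquad (48)$$

Thus, the value of a field element $\psi_t$ is not changed if $-\theta_2 \le \Psi_t \cdot \phi \le \theta_1$, where the thresholds $\theta_1, \theta_2 > 0$.
The rule for $\phi' = S(\phi, \psi\Psi)$ is analogous.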
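The three-way update rule of Eq. 48 is easy to misread, so a small discrete sketch may help. It assumes sampled fields stored as lists with unit measure per sample; the kernel `Psi`, the fields, and the threshold values below are hypothetical and were chosen only so that all three cases occur.

```python
def inner(a, b):
    """Discrete inner product with unit measure per sample (an assumption)."""
    return sum(x * y for x, y in zip(a, b))

def threshold_update(old, drive, theta1, theta2):
    """Sketch of S from Eq. 48: elements driven above theta1 become +1,
    elements driven below -theta2 become -1, and the rest keep their old value."""
    new = []
    for old_value, d in zip(old, drive):
        if d > theta1:
            new.append(1)
        elif d < -theta2:
            new.append(-1)
        else:
            new.append(old_value)
    return new

# Hypothetical 4x3 kernel and fields.
Psi = [[2, 0, 0],
       [0, 1, 0],
       [0, 0, -1],
       [-3, 0, 0]]
phi = [1, 1, 1]
psi = [-1, -1, 1, 1]

drive = [inner(row, phi) for row in Psi]   # (Psi phi)_t = Psi_t . phi
psi_next = threshold_update(psi, drive, theta1=1.5, theta2=1.5)
```

Here the first element is pushed to +1, the last is pushed to −1, and the two middle elements fall in the dead band and are left unchanged.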

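Following Hopfield, we show that the stored pairs $(\psi^{(k)}, \phi^{(k)})$ are stable states in the dynamic behavior
of the memory.

Theorem (Stability of Stored Pairs): Any stored pair is stable, provided $|\Omega_1| > \theta$, where
$\theta = \max(\theta_1, \theta_2)$. (This condition holds for realistic values of the thresholds.)

Proof: Suppose that $(\psi^{(j)}, \phi^{(j)})$ is one of the stored pairs and observe:

$$\Psi\phi^{(j)} = \Big(\sum_k \psi^{(k)} \wedge \phi^{(k)}\Big)\phi^{(j)} = \sum_k \big(\psi^{(k)} \wedge \phi^{(k)}\big)\phi^{(j)} = \sum_k \psi^{(k)}\,\big(\phi^{(k)} \cdot \phi^{(j)}\big)$$

The expression $\phi^{(k)} \cdot \phi^{(j)}$ measures the correlation between $\phi^{(k)}$ and $\phi^{(j)}$, which varies between $-|\Omega_1|$
and $+|\Omega_1|$. Notice that for $j \ne k$ this expression has mean value 0. On the other hand, when j = k its
value is $|\Omega_1|$. Hence,

$$\Psi\phi^{(j)} \approx \psi^{(j)}\,|\Omega_1| \qquad (49)$$

Now consider the crucial expression from Eq. 48:

$$\Psi_t \cdot \phi^{(j)} \approx \psi_t^{(j)}\,|\Omega_1| \qquad (50)$$

If $\psi_t^{(j)} = +1$, then

$$\Psi_t \cdot \phi^{(j)} \approx |\Omega_1| > \theta_1$$

and so $\psi'_t = +1$. Similarly, if $\psi_t^{(j)} = -1$, then

$$\Psi_t \cdot \phi^{(j)} \approx -|\Omega_1| < -\theta_2$$

and so $\psi'_t = -1$. In both cases $\psi'_t = \psi_t^{(j)}$. Since $\psi' = \psi^{(j)}$ and (by similar reasoning) $\phi' = \phi^{(j)}$, we
see that any stored pair $(\psi^{(j)}, \phi^{(j)})$ is stable. ∎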

The preceding theorem shows that the stored pairs are stable, but it does not show that they will ever
be reached. Therefore we prove several theorems establishing conditions under which correct recall takes
place. The following theorem shows that if $\phi$ is sufficiently close to one $\phi^{(j)}$ and sufficiently far from all
the others, then perfect recall occurs in one step.

Theorem (Close Matches Lead to Correct Recall): $\psi' = \psi^{(j)}$, provided that

$$\phi^{(j)} \cdot \phi - \theta > \sum_{k \ne j} \phi^{(k)} \cdot \phi \qquad (51)$$

where $\theta = \max(\theta_1, \theta_2)$. Similarly, $\phi' = \phi^{(j)}$ provided $\psi^{(j)} \cdot \psi - \theta > \sum_{k \ne j} \psi^{(k)} \cdot \psi$.

Proof: For notational convenience let $\sigma_k = \phi^{(k)} \cdot \phi$ be the similarity of $\phi^{(k)}$ and $\phi$. Thus we are assuming

$$\sigma_j - \theta > \sum_{k \ne j} \sigma_k \qquad (52)$$

We need to show $\psi' = \psi^{(j)}$, so consider $\Psi\phi$:

$$\Psi\phi = \Big(\sum_k \psi^{(k)} \wedge \phi^{(k)}\Big)\phi = \sum_k \big(\psi^{(k)} \wedge \phi^{(k)}\big)\phi = \sum_k \psi^{(k)}\,\big(\phi^{(k)} \cdot \phi\big) = \sum_k \sigma_k\,\psi^{(k)} \qquad (53)$$

This equation is plausible: it says that each $\psi^{(k)}$ contributes to the result to the extent the corresponding
$\phi^{(k)}$ is similar to $\phi$. From Eq. 53 we have

$$\Psi_t \cdot \phi = \sigma_j\,\psi_t^{(j)} + \sum_{k \ne j} \sigma_k\,\psi_t^{(k)} \qquad (54)$$

Now we consider the cases $\psi_t^{(j)} = +1$ and $\psi_t^{(j)} = -1$. If $\psi_t^{(j)} = +1$ then

$$\Psi_t \cdot \phi = \sigma_j + \sum_{k \ne j} \sigma_k\,\psi_t^{(k)} \ge \sigma_j - \sum_{k \ne j} \sigma_k > \theta \ge \theta_1$$

Hence $\psi'_t = +1 = \psi_t^{(j)}$. If $\psi_t^{(j)} = -1$ then

$$\Psi_t \cdot \phi = -\sigma_j + \sum_{k \ne j} \sigma_k\,\psi_t^{(k)} \le -\sigma_j + \sum_{k \ne j} \sigma_k < -\theta \le -\theta_2$$

Hence $\psi'_t = -1 = \psi_t^{(j)}$. In either case $\psi'_t = \psi_t^{(j)}$. The proof that $\phi' = \phi^{(j)}$ is analogous. ∎

The preceding proof makes very pessimistic assumptions, namely, that the other $\psi^{(k)}$ are negatively
correlated with $\psi^{(j)}$, and thus maximally interfering. In the more likely case, where they are uncorrelated,
or even better, where closeness in the $\phi^{(k)}$ implies closeness in the $\psi^{(k)}$, we can get correct recall in spite of
the other $\phi^{(k)}$ being similar to $\phi$.
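A small discrete experiment can make the one-step recall condition of Eq. 51 concrete. The sketch below assumes sampled bipolar fields with unit measure per sample, a single threshold θ used for both θ1 and θ2, and two hypothetical stored pairs; `store`, `threshold_update`, and the probe field are illustrative and not taken from the original.

```python
def inner(a, b):
    """Discrete inner product with unit measure per sample (an assumption)."""
    return sum(x * y for x, y in zip(a, b))

def store(pairs):
    """Psi = sum over stored pairs of the outer product psi_k ^ phi_k (Eq. 45)."""
    rows, cols = len(pairs[0][0]), len(pairs[0][1])
    Psi = [[0] * cols for _ in range(rows)]
    for psi_k, phi_k in pairs:
        for t in range(rows):
            for s in range(cols):
                Psi[t][s] += psi_k[t] * phi_k[s]
    return Psi

def threshold_update(old, drive, theta):
    """Eq. 48 with theta1 = theta2 = theta (a simplifying assumption)."""
    return [1 if d > theta else -1 if d < -theta else o for o, d in zip(old, drive)]

# Two hypothetical stored pairs (psi_k, phi_k); phi1 and phi2 are orthogonal.
phi1, psi1 = [1, 1, 1, 1, -1, -1, -1, -1], [1, 1, -1, -1]
phi2, psi2 = [1, -1, 1, -1, 1, -1, 1, -1], [1, -1, 1, -1]
pairs = [(psi1, phi1), (psi2, phi2)]
Psi = store(pairs)

theta = 1
probe = [1, 1, 1, -1, -1, -1, -1, -1]          # phi1 with one element flipped

# Similarities sigma_k = phi_k . probe and the sufficient condition of Eq. 51 for j = 1.
sigma = [inner(phi_k, probe) for _, phi_k in pairs]
condition_holds = sigma[0] - theta > sum(sigma[1:])

start = [1, 1, 1, 1]                            # arbitrary initial output field
recalled = threshold_update(start, [inner(row, probe) for row in Psi], theta)
```

With these values the similarities are 6 and 2, the condition holds, and a single update recovers `psi1` even though the probe differs from `phi1`.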

Our next goal is to show that the system converges to a stable state. To accomplish this, following
Kosko and Hopfield, we investigate how a Lyapunov function for the system changes in time. The function
is defined:

$$E(\psi, \phi) = -\tfrac{1}{2}\,\psi \cdot (\Psi\phi - 2\theta_1) - \tfrac{1}{2}\,\phi \cdot (\psi\Psi - 2\theta_2) \qquad (55)$$

(We write $\theta_i$ ambiguously for $\theta_i\mathbf{1}$.) This formula can be understood by observing that $\psi \cdot \Psi\phi$ measures the
correlation between $\psi$ and $\Psi\phi$, and $\phi \cdot \psi\Psi$ measures the correlation between $\phi$ and $\psi\Psi$. Thus E represents
the "energy" of the system, which decreases as these correlations increase. Eq. 55 can be simplified to:

$$E(\psi, \phi) = -\psi\Psi\phi + \psi \cdot \theta_1 + \phi \cdot \theta_2 \qquad (56)$$

where we write $\psi\Psi\phi = \psi \cdot (\Psi\phi) = (\psi\Psi) \cdot \phi$. We now use this energy function to prove some important results.

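Theorem (Monotonicity of Change of Energy): Changes of state always decrease the energy. That
is, $\Delta E(\psi) < 0$ and $\Delta E(\phi) < 0$.

Proof: The change in energy resulting from alteration of $\psi$ to $\psi'$ is:

$$\Delta E(\psi) = E(\psi', \phi) - E(\psi, \phi)$$
$$= (-\psi'\Psi\phi + \psi' \cdot \theta_1 + \phi \cdot \theta_2) - (-\psi\Psi\phi + \psi \cdot \theta_1 + \phi \cdot \theta_2)$$
$$\Delta E(\psi) = -\Delta\psi \cdot (\Psi\phi - \theta_1) \qquad (57)$$

The analysis for the change in $\phi$ is analogous. Expanding the inner product in Eq. 57 yields:

$$\Delta E(\psi) = -\int_{\Omega_2} \Delta\psi_t\,(\Psi_t \cdot \phi - \theta_1)\,dt \qquad (58)$$

We next investigate the integrand.

The updating equation for $\psi$ (Eq. 48) yields the following possibilities:

$$\begin{cases} \Delta\psi_t \in \{0, +2\} & \text{if } \Psi_t \cdot \phi - \theta_1 > 0 \\ \Delta\psi_t = 0 & \text{if } -\theta_2 \le \Psi_t \cdot \phi \le \theta_1 \\ \Delta\psi_t \in \{0, -2\} & \text{if } \Psi_t \cdot \phi + \theta_2 < 0 \end{cases} \qquad (59)$$

Since $\Psi_t \cdot \phi - \theta_1 < \Psi_t \cdot \phi + \theta_2 < 0$ in the third case, note that in all three cases:

$$\Delta\psi_t\,(\Psi_t \cdot \phi - \theta_1) \ge 0 \qquad (60)$$

Furthermore, note that if $\Delta\psi_t \ne 0$ then this inequality is strict. Observe that since $\Delta\psi \ne 0$, this strict
inequality must hold for a set of $t \in \Omega_2$ with nonzero measure. This implies the strict monotonicity of
$\Delta E(\psi)$:

$$\Delta E(\psi) = -\int_{\Omega_2} \Delta\psi_t\,(\Psi_t \cdot \phi - \theta_1)\,dt < 0 \qquad (61)$$

Hence, until the system stabilizes, every $\phi \to \psi$ step of the iteration decreases the energy. Analogous
reasoning shows that the energy also decreases with the $\psi \to \phi$ steps. ∎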

We  have  shown  that  every  step  decreases  the  energy.    We  now  show  that  the  energy  cannot  decrease 
forever,  because  it  has  a  lower  bound. 

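Theorem (Energy Bounded From Below): The energy is bounded from below by a quantity that
depends only on $\Psi$.

Proof: Recall (Eq. 56) that the energy is defined: $E(\psi, \phi) = -\psi\Psi\phi + \psi \cdot \theta_1 + \phi \cdot \theta_2$. Since $\Psi$, $\theta_1$ and
$\theta_2$ are fixed, there are clearly bipolar $\psi$ and $\phi$ that minimize this expression.

We can derive a formula for an explicit lower bound as follows. Since $|\psi_t| \le 1$ and $|\phi_s| \le 1$, we have the
inequality:

$$\psi\Psi\phi = \int_{\Omega_2}\!\int_{\Omega_1} \psi_t\,\Psi_{ts}\,\phi_s\,ds\,dt \le \int_{\Omega_2}\!\int_{\Omega_1} M\,ds\,dt = M\,|\Omega_1|\,|\Omega_2|$$

where $M = \max_{s,t} |\Psi_{ts}|$. Similarly, $\psi \cdot \theta_1 \ge -\theta_1|\Omega_2|$ and $\phi \cdot \theta_2 \ge -\theta_2|\Omega_1|$. Therefore,

$$E(\psi, \phi) = -\psi\Psi\phi + \psi \cdot \theta_1 + \phi \cdot \theta_2 \ge -M\,|\Omega_1|\,|\Omega_2| - \theta_1|\Omega_2| - \theta_2|\Omega_1|$$

This is our lower bound. It is fixed since it depends only on $\Psi$ and the fixed thresholds. ∎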
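The two energy results can be observed together in a discrete simulation. The sketch below makes several assumptions that are not in the original: fields are sampled lists with unit measure per sample, a single threshold θ stands in for both θ1 and θ2 (so the update rules for ψ and φ are exactly symmetric), and the stored pairs and probe are hypothetical. The bound used is the conservative value −M|Ω1||Ω2| − θ|Ω2| − θ|Ω1|, which holds whenever field values lie in [−1, +1].

```python
def inner(a, b):
    """Discrete inner product with unit measure per sample (an assumption)."""
    return sum(x * y for x, y in zip(a, b))

def store(pairs):
    """Psi = sum over stored pairs of the outer product psi_k ^ phi_k (Eq. 45)."""
    return [[sum(psi_k[t] * phi_k[s] for psi_k, phi_k in pairs)
             for s in range(len(pairs[0][1]))]
            for t in range(len(pairs[0][0]))]

def threshold_update(old, drive, theta):
    """Eq. 48 with theta1 = theta2 = theta (a simplifying assumption)."""
    return [1 if d > theta else -1 if d < -theta else o for o, d in zip(old, drive)]

def energy(psi, phi, Psi, theta):
    """Eq. 56: E = -psi Psi phi + psi . theta + phi . theta."""
    drive = [inner(row, phi) for row in Psi]
    return -inner(psi, drive) + theta * sum(psi) + theta * sum(phi)

phi1, psi1 = [1, 1, 1, 1, -1, -1, -1, -1], [1, 1, -1, -1]
phi2, psi2 = [1, -1, 1, -1, 1, -1, 1, -1], [1, -1, 1, -1]
Psi = store([(psi1, phi1), (psi2, phi2)])
columns = [list(col) for col in zip(*Psi)]      # used for the psi Psi product

theta = 1
psi = [1, 1, 1, 1]                              # arbitrary starting output
phi = [1, 1, 1, -1, -1, -1, -1, -1]             # noisy version of phi1

energies = [energy(psi, phi, Psi, theta)]
for _ in range(3):
    psi = threshold_update(psi, [inner(row, phi) for row in Psi], theta)
    energies.append(energy(psi, phi, Psi, theta))
    phi = threshold_update(phi, [inner(col, psi) for col in columns], theta)
    energies.append(energy(psi, phi, Psi, theta))

M = max(abs(v) for row in Psi for v in row)
lower_bound = -M * len(phi) * len(psi) - theta * len(psi) - theta * len(phi)
```

In this run the recorded energies never increase, stay above `lower_bound`, and the state settles on the stored pair (`psi1`, `phi1`). In the discrete setting the iteration actually stops changing, which matches the remark that discreteness lets Kosko prove stability.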

The  preceding  theorems  show  that  the  bidirectional  associative  field  memory  approaches  one  of  the 
states  characterizing  a  stored  associative  pair. 

5.  CONCLUSIONS 

We  have  argued  that  AI  is  moving  by  necessity  into  a  new  phase  that  recognizes  the  role  of  nonproposi- 
tional  knowledge  in  intelligent  behavior.  We  also  argued  that  the  "new"  AI  must  make  use  of  massive 
parallelism  to  achieve  its  ends.  We  proposed  a  definition  of  massive  parallelism,  namely  that  the  number 
of  processing  elements  can  be  taken  as  a  continuous  quantity.  We  believe  that  this  definition  will 
encourage  the  development  of  the  necessary  theoretical  basis  for  neurocomputers,  optical  computers,  molec- 
ular computers,  and  a  new  generation  of  analog  computers.  We  claimed  that  these  computing  technologies 
can  be  profitably  viewed  as  field  computers,  computers  that  operate  on  entire  fields  of  data  in  parallel.  We 
discussed  the  importance  of  general  purpose  field  computers,  and  related  them  to  universal  field  computers. 
This  was  followed  by  a  theoretical  model  of  field  computation,  including  the  derivation  of  several  generali- 
zations of  Taylor's  theorem  for  field  transformations.  These  theorems  provide  one  theoretical  basis  for 
universal field computers. Finally, we illustrated our theory of field computation by analyzing a
continuous field version of Kosko's bidirectional associative memory.